
40 research outputs found

Investigating Antigram Behaviour using Distributional Semantics

Author: Sengupta Saptarshi
Publication date: 15/01/2019

Language is an extremely interesting subject to study, with each day presenting new challenges and new topics for research. Words in particular have several unique characteristics which, when explored, prove to be astonishing. Anagrams and antigrams are words possessing such properties. The presented work explores generating anagrams from a given word and determining whether antigram relationships exist between the pairs of generated anagrams, in light of the Word2Vec distributional semantic similarity model. The experiments conducted showed promising results for detecting antigrams. Comment: 4 pages
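
The pipeline described in this abstract (group words into anagram sets, then score each pair with an embedding similarity) might be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the decision rule, so the `threshold` cut-off is hypothetical, and the vectors are hand-made toy values rather than real Word2Vec output. The function names are likewise invented for this sketch.

```python
from itertools import combinations
from math import sqrt


def anagram_groups(vocabulary):
    """Group words that share the same multiset of letters."""
    groups = {}
    for word in vocabulary:
        key = "".join(sorted(word.lower()))
        groups.setdefault(key, set()).add(word.lower())
    # Keep only groups that contain at least one anagram pair.
    return {k: sorted(v) for k, v in groups.items() if len(v) > 1}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def antigram_candidates(vocabulary, vectors, threshold=0.0):
    """Return anagram pairs whose similarity is at or below `threshold`.

    The threshold rule is an assumption made for this sketch, not the
    criterion used in the paper.
    """
    pairs = []
    for words in anagram_groups(vocabulary).values():
        for a, b in combinations(words, 2):
            if a in vectors and b in vectors:
                sim = cosine(vectors[a], vectors[b])
                if sim <= threshold:
                    pairs.append((a, b, sim))
    return pairs


# Toy, hand-made vectors (NOT real Word2Vec output), for illustration only.
toy_vectors = {
    "united": [1.0, 0.2],
    "untied": [-0.9, 0.1],
    "listen": [0.3, 1.0],
    "silent": [0.4, 0.9],
}
candidates = antigram_candidates(toy_vectors.keys(), toy_vectors)
```

With real embeddings, the `toy_vectors` dictionary would be replaced by lookups into a trained model. Note that distributional similarity alone may not separate antonyms cleanly, since opposites often occur in similar contexts.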

arXiv.org e-Print Archive

Chaotic Quantum Double Delta Swarm Algorithm using Chebyshev Maps: Theoretical Foundations, Performance Analyses and Convergence Issues

Author: Basak Sanchita
Peters II Richard Alan
Sengupta Saptarshi
Publication venue: 'MDPI AG'
Publication date: 01/01/2019

The Quantum Double Delta Swarm (QDDS) Algorithm is a new metaheuristic algorithm inspired by the convergence mechanism to the center of potential generated within a single well of a spatially co-located double-delta well setup. It mimics the wave nature of candidate positions in solution spaces and draws upon quantum mechanical interpretations much like other quantum-inspired computational intelligence paradigms. In this work, we introduce a Chebyshev-map-driven chaotic perturbation in the optimization phase of the algorithm to diversify weights placed on contemporary and historical, socially-optimal agents' solutions. We follow this up with a characterization of solution quality on a suite of 23 single-objective functions and carry out a comparative analysis with eight other related nature-inspired approaches. By comparing solution quality and successful runs over dynamic solution ranges, insights about the nature of convergence are obtained. A two-tailed t-test establishes the statistical significance of the solution data, whereas Cohen's d and Hedges' g values provide a measure of effect sizes. We trace the trajectory of the fittest pseudo-agent over all function evaluations to comment on the dynamics of the system and prove that the proposed algorithm is theoretically globally convergent under the assumptions adopted for proofs of other closely-related random search algorithms. Comment: 27 pages, 4 figures, 19 tables
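
For readers unfamiliar with the chaotic map named in this abstract, the standard Chebyshev map iterates x_{n+1} = cos(k arccos(x_n)) on [-1, 1]. The sketch below only generates such a sequence and rescales it to [0, 1]. How QDDS actually applies the values to agent weights is not stated in the abstract, so the rescaling step, the function names, and the chosen `order` and seed are all assumptions made for illustration.

```python
import math


def chebyshev_sequence(x0, order=4, length=10):
    """Iterate the Chebyshev map x_{n+1} = cos(order * arccos(x_n))."""
    if not -1.0 <= x0 <= 1.0:
        raise ValueError("x0 must lie in [-1, 1]")
    values = []
    x = x0
    for _ in range(length):
        x = math.cos(order * math.acos(x))
        # Clamp to guard against floating-point drift outside [-1, 1].
        x = max(-1.0, min(1.0, x))
        values.append(x)
    return values


def to_unit_interval(x):
    """Map a value from [-1, 1] to [0, 1] (hypothetical rescaling)."""
    return 0.5 * (x + 1.0)


# Hypothetical use: chaotic weights in [0, 1] from an arbitrary seed.
weights = [to_unit_interval(x) for x in chebyshev_sequence(0.7, order=4, length=5)]
```

Seeds at fixed points of the map, such as 1.0, produce a constant sequence and would normally be avoided.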

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

TFBEST: Dual-Aspect Transformer with Learnable Positional Encoding for Failure Prediction

Author: Mohapatra Rohan
Sengupta Saptarshi
Publication date: 05/09/2023

Hard Disk Drive (HDD) failures in datacenters are costly - from catastrophic data loss to a question of goodwill, stakeholders want to avoid them like the plague. An important tool in proactively monitoring against HDD failure is timely estimation of the Remaining Useful Life (RUL). To this end, the Self-Monitoring, Analysis and Reporting Technology (S.M.A.R.T.) employed within HDDs provides critical logs for long-term maintenance of the security and dependability of these essential data storage devices. Data-driven predictive models in the past have relied heavily on these S.M.A.R.T. logs and CNN/RNN-based architectures. However, they have struggled both to provide a confidence interval around the predicted RUL values and to process very long sequences of logs. In addition, some of these approaches, such as those based on LSTMs, are inherently slow to train and have tedious feature engineering overheads. To overcome these challenges, in this work we propose a novel transformer architecture - a Temporal-fusion Bi-encoder Self-attention Transformer (TFBEST) for predicting failures in hard drives. It is an encoder-decoder based deep learning technique that enhances the context gained from understanding health statistics sequences and predicts a sequence of the number of days remaining before a disk potentially fails. In this paper, we also provide a novel confidence margin statistic that can help manufacturers replace a hard drive within a time frame. Experiments on Seagate HDD data show that our method significantly outperforms the state-of-the-art RUL prediction methods during testing over the exhaustive 10-year data from Backblaze (2013-present). Although validated on HDD failure prediction, the TFBEST architecture is well-suited for other prognostics applications and may be adapted for allied regression problems. Comment: 9 pages, 6 figures, 2 tables

arXiv.org e-Print Archive

Superfluid-Insulator transition of two-species bosons with spin-orbit coupling

Author: Mandal Saptarshi
Saha Kush
Sengupta K.
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2012

Motivated by recent experiments [Y.J. Lin et al., Nature 471, 83 (2011)], we study Mott phases and superfluid-insulator (SI) transitions of two-species ultracold bosonic atoms in a two-dimensional square optical lattice with nearest neighbor hopping amplitude $t$ in the presence of a spin-orbit coupling characterized by a tunable strength $\gamma$. Using both strong-coupling expansion and Gutzwiller mean-field theory, we chart out the phase diagrams of the bosons in the presence of such spin-orbit interaction. We compute the momentum distribution of the bosons in the Mott phase near the SI transition point and show that it displays precursor peaks whose position in the Brillouin zone can be varied by tuning $\gamma$. Our analysis of the critical theory of the transition unravels the presence of unconventional quantum critical points at $t/\gamma=0$ which are accompanied by emergence of an additional gapless mode in the critical region. We also study the superfluid phases of the bosons near the SI transition using a Gutzwiller mean-field theory which reveals the existence of a twisted superfluid phase with an anisotropic twist angle which depends on $\gamma$. Finally, we compute the collective modes of the bosons and point out the presence of reentrant SI transitions as a function of $\gamma$ for non-zero $t$. We propose experiments to test our theory. Comment: v2, 13 pages, 9 figs; new section and fig

arXiv.org e-Print Archive

Large-scale End-of-Life Prediction of Hard Disks in Distributed Datacenters

Author: Coursey Austin
Mohapatra Rohan
Sengupta Saptarshi
Publication date: 20/03/2023

On a daily basis, data centers process huge volumes of data backed by the proliferation of inexpensive hard disks. Data stored on these disks serve a range of critical functional needs, from finance and healthcare to aerospace. As such, premature disk failure and consequent loss of data can be catastrophic. To mitigate the risk of failures, cloud storage providers perform condition-based monitoring and replace hard disks before they fail. By estimating the remaining useful life of hard disk drives, one can predict the time-to-failure of a particular device and replace it at the right time, ensuring maximum utilization whilst reducing operational costs. In this work, large-scale predictive analyses are performed using severely skewed health statistics data by incorporating customized feature engineering and a suite of sequence learners. Past work suggests that LSTMs are an excellent approach to predicting remaining useful life. To this end, we present an encoder-decoder LSTM model in which the context gained from understanding health statistics sequences aids in predicting an output sequence of the number of days remaining before a disk potentially fails. The models developed in this work are trained and tested across an exhaustive set of all of the 10 years of S.M.A.R.T. health data in circulation from Backblaze and on a wide variety of disk instances. This work closes the knowledge gap on what full-scale training achieves on thousands of devices and advances the state-of-the-art by providing tangible metrics for evaluation and generalization for practitioners looking to extend their workflow to all years of health data in circulation across disk manufacturers. The encoder-decoder LSTM posted an RMSE of 0.83 during training and 0.86 during testing over the exhaustive 10-year data while being able to generalize competitively over other drives from the Seagate family. Comment: 8 pages, 9 figures and 6 tables
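
To make the quantities in this abstract concrete, the sketch below shows one plausible way to derive "days remaining" targets for a failed disk and to score predictions with RMSE. It is an illustration only: the labelling convention (failure on the final logged day), the function names, and the sample predictions are assumptions, and the abstract does not say whether the reported RMSE values are on raw or normalized targets.

```python
from math import sqrt


def rul_labels(num_days):
    """Days remaining before failure for each daily record.

    Assumes one record per day and that the disk fails on the last
    recorded day, so the final label is 0 (a hypothetical convention).
    """
    return [num_days - 1 - i for i in range(num_days)]


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return sqrt(squared / len(y_true))


# Made-up predictions for a disk with five days of logs.
targets = rul_labels(5)
predictions = [4.5, 3.0, 1.5, 1.0, 0.0]
error = rmse(targets, predictions)
```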

arXiv.org e-Print Archive